A Generalized Model for Multimodal Perception

Authors

Sz-Rung Shiang

Anatole Gershman

Jean Oh

Abstract

For autonomous robots and humans to collaborate effectively on a task, robots need to perceive their environments in a way that is accurate and consistent with their human teammates. To develop such cohesive perception, robots also need to digest human teammates’ descriptions of an environment and combine them with what they have perceived through computer vision systems. In this context, we develop a graphical model for fusing object recognition results from two different modalities: computer vision and verbal descriptions. In this paper, we focus specifically on three types of verbal descriptions, namely egocentric positions, relative positions using a landmark, and numeric constraints. We develop a Conditional Random Fields (CRF) based approach to fuse the visual and verbal modalities, in which we model n-ary relations (or descriptions) as factor functions. We hypothesize that human descriptions of an environment will improve a robot’s recognition if the information can be properly fused. To verify our hypothesis, we apply our model to the object recognition problem and evaluate our approach on the NYU Depth V2 and Visual Genome datasets. We report results on sets of experiments that demonstrate the significant advantage of multimodal perception, and we discuss potential real-world applications of our approach.
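The fusion idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: it assumes three hypothetical detections with made-up classifier probabilities, hand-picked factor weights, and brute-force enumeration in place of real CRF inference. The function names (`egocentric_left`, `left_of`, `count_is`, `map_labels`) are illustrative only. It shows how the three description types might act as factor functions that rescore joint label assignments.

```python
import itertools
import math

LABELS = ["mug", "monitor", "chair"]

# Hypothetical detections: horizontal position in [0, 1] (0 = robot's far left)
# and made-up vision classifier probabilities for each label.
detections = [
    {"x": 0.2, "probs": {"mug": 0.40, "monitor": 0.15, "chair": 0.45}},
    {"x": 0.6, "probs": {"mug": 0.10, "monitor": 0.70, "chair": 0.20}},
    {"x": 0.9, "probs": {"mug": 0.20, "monitor": 0.20, "chair": 0.60}},
]


def unary(assignment):
    """Vision term: sum of log-probabilities of the chosen labels."""
    return sum(math.log(d["probs"][lab]) for d, lab in zip(detections, assignment))


def egocentric_left(label, weight=2.0):
    """Egocentric position, e.g. 'there is a mug on my left'."""
    def factor(assignment):
        hit = any(lab == label and d["x"] < 0.5
                  for d, lab in zip(detections, assignment))
        return weight if hit else 0.0
    return factor


def left_of(target, landmark, weight=2.0):
    """Relative position with a landmark, e.g. 'the mug is left of the monitor'."""
    def factor(assignment):
        for i, li in enumerate(assignment):
            for j, lj in enumerate(assignment):
                if (li == target and lj == landmark
                        and detections[i]["x"] < detections[j]["x"]):
                    return weight
        return 0.0
    return factor


def count_is(label, n, weight=2.0):
    """Numeric constraint, e.g. 'there is exactly one chair'."""
    def factor(assignment):
        return weight if assignment.count(label) == n else 0.0
    return factor


def map_labels(factors):
    """Pick the highest-scoring joint labelling by exhaustive search."""
    candidates = itertools.product(LABELS, repeat=len(detections))
    best = max(candidates, key=lambda a: unary(a) + sum(f(a) for f in factors))
    return list(best)


vision_only = map_labels([])
fused = map_labels([
    egocentric_left("mug"),
    left_of("mug", "monitor"),
    count_is("chair", 1),
])
print("vision only:", vision_only)
print("fused:      ", fused)
```

With these assumed numbers, vision alone labels the first detection as a chair, while the verbal factors shift it to a mug. Exhaustive search is only feasible for a handful of objects; a practical system would need approximate inference and learned weights.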


Similar resources

A model for distribution centers location-routing problem on a multimodal transportation network with a meta-heuristic solving approach

Nowadays, organizations must compete with rivals at the regional, national, and international levels, so they have to improve their competitive capabilities to survive. Operating on a global scale requires a proper distribution system that can take advantage of different transportation modes. Accordingly, the present paper addresses a location-r...


Multimodal Psychotherapy in Patients with Multiple Sclerosis (MS)

Objective: The main purpose of this study was to investigate the effectiveness of Lazarus Multimodal Psychotherapy (MMT) on perceived stress in individuals with Multiple Sclerosis (MS). Methods: Using a quasi-experimental design, forty patients in the city of Qazvin, Iran, were selected by convenience sampling and then divided into experimental and control groups. After group assignme...


Capacitated Multimodal Structure of a Green Supply Chain Network Considering Multiple Objectives

This paper describes a supply chain network design problem that incorporates environmental concerns in the arcs and nodes of the network. It is assumed that several routes, such as road and rail, exist between each pair of nodes. The decision variables in this model are which facilities to open, the environmental investment level in each facility, and the flow of products between nodes on each route. A multi...


Evaluating the Emotion Regulation Model of Generalized Anxiety Disorder in Explaining Pain Perception

Pain is the most common stressor that humans face. This study aimed to determine the fit of the emotion regulation model of generalized anxiety disorder in explaining the perception of pain in patients with chronic pain. The study was conducted as correlational research using structural equation modeling. The sample consisted of 210 patients referred to a specialized pai...


An Analysis-By-Synthesis Approach to Multisensory Object Shape Perception

The world is multimodal. We sense our environments using inputs from multiple sensory modalities. Similarly, digital information is increasingly available through multiple media. In this extended abstract, we present a general computational framework for understanding multimodal learning and perception that builds on the analysis-by-synthesis approach [2, 3]. The analysis-by-synthesis approach...




Publication date: 2017
